ABSTRACT


The rate of convergence of the Reset Davidon Variable Metric
Method for minimizing an unconstrained function of n variables is
considered. If the limit point of the sequence of points generated by the
method is a stationary point with a positive definite Hessian, the rate
of convergence is superlinear with respect to cycles of n points.

With an additional Lipschitz assumption the rate of convergence is shown
to be at least quadratic for subsets of cycles of n points.


THE RATE OF CONVERGENCE OF THE RESET
DAVIDON VARIABLE METRIC METHOD


Garth  P.  McCormick 

Introduction 

The Variable Metric Method proposed by Davidon [1] for solving
the unconstrained minimization problem

$$ \text{minimize } f(x), \tag{1} $$

where $x = (x_1, \ldots, x_n)'$ and $f \in C^1$, is summarized as follows.


Davidon Variable Metric Method (DVMM)


STEP 0: Let $H^0$ be some arbitrary symmetric positive definite matrix
($z'H^0 z > 0$ for all $z \neq 0$), and $x^0$ be some arbitrary initial
starting point. Set $s^0 = -H^0 g^0$. Proceed as in the general step
$k+1$ at equation (8).


STEP k + 1: At point $x^k$ ($k = 0, 1, \ldots$) let
where


$$ \beta^k = [(y^k)' H^k y^k]^{-1}, $$

$$ \rho^k = [(\sigma^k)' y^k]^{-1}, $$


$g^k = \nabla f(x^k)$, the vector of first partial derivatives
of $f(x)$ at $x^k$. Let the direction vector at $x^{k+1}$ be given by

$$ s^{k+1} = -H^{k+1} g^{k+1}. \tag{7} $$


Let $t^{k+1}$ be the smallest local minimizer of the one dimensional
programming problem:

$$ \text{minimize } f(x^{k+1} + s^{k+1} t). \tag{8} $$

Set

$$ x^{k+2} = x^{k+1} + s^{k+1} t^{k+1}. \tag{9} $$

Repeat the step for $x^{k+2}$. If at any iteration $k$, $g^k = 0$,
the procedure ceases.


Fletcher and Powell [2] showed that the above DVMM minimizes
$f(x)$, where $f(x)$ is a positive definite quadratic form, in $n$ steps or fewer.
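The finite-termination property can be checked numerically. The sketch below is an illustration only: it assumes the standard Davidon-Fletcher-Powell form of the matrix update, uses the closed-form exact line search that is available for a quadratic, and the names `dfp_quadratic`, `A` and `b` are hypothetical choices made for this example, not taken from [2].

```python
import numpy as np

def dfp_quadratic(A, b, x0):
    """Run n DFP steps with exact line searches on f(x) = 0.5 x'Ax - b'x."""
    n = len(b)
    x = np.asarray(x0, dtype=float)
    H = np.eye(n)                      # H^0: any symmetric positive definite matrix
    g = A @ x - b                      # gradient of the quadratic
    for _ in range(n):
        if np.linalg.norm(g) < 1e-12:  # stop if already stationary
            break
        s = -H @ g                     # direction vector
        t = -(g @ s) / (s @ A @ s)     # exact minimizer of f along s (quadratic case)
        sigma = t * s                  # step actually taken
        x_new = x + sigma
        g_new = A @ x_new - b
        y = g_new - g                  # change in gradient
        Hy = H @ y
        # Assumed (standard) DFP update of the matrix H
        H = H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(sigma, sigma) / (sigma @ y)
        x, g = x_new, g_new
    return x, H, g

# Hypothetical test problem: a random symmetric positive definite matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)
b = rng.standard_normal(4)
x, H, g = dfp_quadratic(A, b, np.zeros(4))
print(np.linalg.norm(g), np.linalg.norm(H - np.linalg.inv(A)))
```

After n = 4 steps both printed quantities should be at rounding-error level: the gradient vanishes and the final matrix agrees with the inverse of `A`.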


No one has been able to show* that for a general function $f(x)$,
limit points of $\{x^k\}$ are stationary points (points where the vector of
first derivatives vanishes). Also, given that a limit point $x^*$ is a
stationary point, no one has been able to ascertain the rate at which the
sequence $\{x^k\}$ converges to $x^*$. Below is given a simple revision of
the DVMM called the Reset Davidon Variable Metric Method (RDVMM).

For this revised algorithm a statement of convergence is given in Theorem 1,
and proofs of the rate of convergence in Theorems 2 and 3. In the
discussion following Theorem 3 the differences between the convergence
and rate of convergence of the RDVMM and the original DVMM are given.

* The author has just received a copy of a paper by Powell [5] who proves that
the DVMM converges to a stationary point if $f(x)$ is a twice differentiable
convex function whose Hessian matrix has eigenvalues bounded below away from
zero. Powell also shows that the rate of convergence is superlinear every step
for this case if an additional Lipschitz condition is placed on the second
derivatives of $f(x)$.


Reset Davidon Variable Metric Method (RDVMM)

In step $k + 1$, if $(k + 1) \equiv 0 \bmod (n + 1)$, then set

$$ H^{k+1} = H^0 \quad \text{(a symmetric positive definite matrix)}. \tag{10} $$

Then equations (4) - (6) are bypassed.
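The reset rule can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the matrix update is taken to be the standard Davidon-Fletcher-Powell formula, the line search is a simple bisection on the directional derivative (adequate for the convex test function used here, whereas the method itself asks for the smallest local minimizer), and the names `rdvmm`, `line_min`, `A` and `b` are hypothetical.

```python
import numpy as np

def line_min(grad, x, s, iters=80):
    """Bisection on the slope grad(x + t s)' s; assumes s is a descent
    direction and that the slope changes sign once along the ray."""
    slope = lambda t: grad(x + t * s) @ s
    t_hi = 1.0
    while slope(t_hi) < 0.0:           # expand until the slope is nonnegative
        t_hi *= 2.0
    t_lo = 0.0
    for _ in range(iters):
        t_mid = 0.5 * (t_lo + t_hi)
        if slope(t_mid) < 0.0:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)

def rdvmm(grad, x0, H0, max_steps=200, tol=1e-10):
    n = len(x0)
    x = np.asarray(x0, dtype=float)
    H = H0.copy()
    g = grad(x)
    for k in range(max_steps):
        if np.linalg.norm(g) < tol:
            break
        s = -H @ g
        sigma = line_min(grad, x, s) * s
        x_new = x + sigma
        g_new = grad(x_new)
        y = g_new - g
        if (k + 1) % (n + 1) == 0:
            H = H0.copy()              # reset: H^{k+1} = H^0, update bypassed
        else:
            Hy = H @ y                 # assumed standard DFP update
            H = H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(sigma, sigma) / (sigma @ y)
        x, g = x_new, g_new
    return x, g, k

# Hypothetical strictly convex test function: 0.5 x'Ax - b'x + 0.25 * sum(x^4).
A = np.array([[3.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 4.0]])
b = np.array([1.0, -2.0, 3.0])
grad = lambda x: A @ x - b + x**3
x_min, g_min, steps = rdvmm(grad, np.array([2.0, -1.0, 1.0]), np.eye(3))
print(steps, np.linalg.norm(g_min))
```

With n = 3 the matrix is reset at steps 4, 8, ..., so each group of n + 1 directions starts from a steepest-descent step when $H^0$ is the identity.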

It  is  useful  here  to  state  without  proof  some  properties  which  apply 
to  both  algorithms. 


For all $k$, $H^k$ is a positive definite matrix. (11)

The direction vector $s^k = 0$ if and only if $g^k = 0$. (12)

Unless $s^k = 0$,

$$ f(x^{k+1}) < f(x^k). \tag{13} $$

Because of (8) and the fact that $f \in C^1$,

$$ (g^{k+1})' s^k = 0, \tag{14} $$

i.e. the gradient at each point in the sequence is orthogonal to the previous
direction.

If $f \in C^2$, using Taylor's theorem, for $j = 1, \ldots, n$,

$$ y_j^k = \sum_{i=1}^{n} \left[ \partial^2 f(\eta^{k,j}) / \partial x_i \partial x_j \right] \sigma_i^k, \tag{15} $$

where each $\eta^{k,j}$ is a convex combination of $x^k$ and $x^{k+1}$. It is convenient
to define a matrix $G(\eta^k)$ whose $ij$-th element is

$$ \partial^2 f(\eta^{k,i}) / \partial x_i \partial x_j. \tag{16} $$

Then the equations expressed by (15) can be summarized as

$$ y^k = G(\eta^k) \sigma^k. \tag{17} $$

Furthermore, because of (8), for $j = 1, \ldots, n$,

$$ f(x^{k+1}) \le f(\eta^{k,j}) \le f(x^k), \tag{18} $$

where either equality holds in (18) only if

$$ x^{k+1} = \eta^{k,j} \quad \text{or} \quad x^k = \eta^{k,j}. $$


The formula for $t^k$, using (2), (7), (9), (14) and (17), is

$$ t^k = \frac{(g^k)' H^k g^k}{(g^k)' H^k G(\eta^k) H^k g^k}. \tag{19} $$
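The step to (19) can be written out as follows. This is a sketch that assumes the usual definitions $\sigma^k = s^k t^k$ and $y^k = g^{k+1} - g^k$ (taken here as assumptions about the algorithm's notation) together with the symmetry of $H^k$.

```latex
\begin{aligned}
0 &= (g^{k+1})' s^k
      && \text{by (14)} \\
  &= (g^k + y^k)' s^k
      && \text{since } y^k = g^{k+1} - g^k \\
  &= (g^k)' s^k + t^k \, (s^k)' G(\eta^k) s^k
      && \text{by (17) with } \sigma^k = s^k t^k \\
  &= -(g^k)' H^k g^k + t^k \, (g^k)' H^k G(\eta^k) H^k g^k
      && \text{by } s^k = -H^k g^k ,
\end{aligned}
\qquad\Longrightarrow\qquad
t^k = \frac{(g^k)' H^k g^k}{(g^k)' H^k G(\eta^k) H^k g^k}.
```

The third line uses the fact that $(y^k)' s^k$ is a scalar, so the order of the factors around $G(\eta^k)$ does not matter.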
For the RDVMM it is useful to divide the sequence $\{x^k\}$ into overlapping
groups of $(n + 2)$ points. The subscript $c$ will be added to indicate the
group ($c = 1, \ldots$), and the superscript $k$ will be used to indicate the
order within the group ($k = 0, 1, \ldots, n+1$).

The last point of each group is the same as the first point of the
next group. The sequence then looks like

$$ \begin{array}{l} x_1^0, \ldots, x_1^{n+1} \\ \qquad\qquad x_2^0, \ldots, x_2^{n+1} \\ \qquad\qquad\qquad\qquad \ldots \end{array} \tag{20} $$

It  is  now  possible  to  state  a  convergence  theorem  about  the  RDVMM. 



Theorem 1 [Convergence of RDVMM].

Assume $f \in C^1$. If $x^*$ is a limit point of $\{x_c^0\}$ generated by
the RDVMM, then

$$ g^* \equiv \nabla f(x^*) = 0. \tag{21} $$

Proof: The proof follows from (10), (7), (8), (9), and the arguments used
in [4, Theorem 1]. Q.E.D.
Well-known necessary conditions that $x^*$ be a local unconstrained
minimum of $f(x)$ are that it be a stationary point (i.e. that (21) hold)
and that (if $f \in C^2$) the matrix $G^*$ of second partial derivatives of $f$ at $x^*$
be a positive semi-definite matrix.

Sufficient conditions that $x^*$ be an isolated local unconstrained
minimum are that (21) hold and that

$G^*$ be a positive definite matrix. (22)


It is for a limit point satisfying (21) and (22) that the rate of
convergence can be determined.

Theorem 2 [Superlinear Convergence].

If: (1) $f \in C^2$,

(2) the RDVMM is applied to problem (1),

then:

(a) every limit point $x^*$ of $\{x_c^0\}$ is a stationary point.

If, in addition,

(3) $G^*$ is a positive definite matrix,

(b) $x^*$ is the unique point of accumulation of $\{x^k\}$, and

(c) with respect to the grouping (20),

$$ \lim_{c \to \infty} \frac{\|x_c^n - x^*\|}{\|x_c^0 - x^*\|} = 0, \tag{23} $$

i.e., convergence is superlinear with respect to every n points within a
group.


Proof: Part (a) is just Theorem 1; part (b) follows from (8) and (22).
The proof of part (c) will consist of a series of assertions.


There are two numbers $\alpha_1$ and $\alpha_2$ such that if $\lambda$ is an
eigenvalue of $H^k$, and $k$ is large,

$$ 0 < \alpha_1 \le \lambda, \tag{24} $$

also

$$ \|H^k s\| \le \alpha_2 \|s\|, \quad \text{for all } s. \tag{25} $$

This follows because the third term of (4) can be written
$\sigma^k [(\sigma^k)' G(\eta^k) \sigma^k]^{-1} (\sigma^k)'$ and in a neighborhood of $x^*$, the eigenvalues
of $G(\eta^k)$ are strictly positive and bounded below away from zero.

In a neighborhood about $x^*$ there are two finite positive numbers $\alpha_6$, $\alpha_7$
such that for $y$ in that neighborhood,

$$ \alpha_6 \|y - x^*\| \le \|g(y)\| \le \alpha_7 \|y - x^*\|, \tag{26} $$

where $g(y) \equiv \nabla f(y)$. This is proved using a Taylor's expansion on $g(y)$, the
stationarity of $x^*$ and the positive definiteness of $G^*$.


In a neighborhood about $x^*$, there is a positive value $\alpha_8$ such that
for $y$, $z$ in that neighborhood, if $f(y) < f(z)$, then

$$ \|y - x^*\| \le \alpha_8 \|z - x^*\|. \tag{27} $$

The proof follows like that in (26).

In a neighborhood about $x^*$ there are values $\alpha_{11}$ and $\alpha_{12}$ such
that

$$ 0 < \alpha_{11} \le t^k \le \alpha_{12}. \tag{28} $$

This follows directly from (19), (24), (25), and the positive definiteness
of $G^*$.

In a neighborhood about $x^*$, there are two values $\alpha_9$, $\alpha_{10}$ such
that

$$ \alpha_9 \le \frac{\|s^k\|}{\|g^k\|} \le \alpha_{10}. \tag{29} $$

This follows directly from (7), (24), and (25).

In a neighborhood of $x^*$ there is a value $\alpha_{13}$ such that if $y$, $z$
are in that neighborhood, and $f(y) < f(z)$, then

$$ \|g(y)\| \le \alpha_{13} \|g(z)\|. \tag{30} $$

In a neighborhood of $x^*$ there are values $\alpha_{14}$, $\alpha_{15}$ such that

$$ 0 < \alpha_{14} \le \frac{\|y^k\|}{\|\sigma^k\|} \le \alpha_{15}. \tag{31} $$

This is easily proved using (17) and the positive definiteness of $G^*$.
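Let

$$ \nu = \limsup_{c \to \infty} \frac{\|x_c^n - x^*\|}{\|x_c^0 - x^*\|}. $$

(Because of (27), $\nu < +\infty$.) To prove part (c) we need to show that $\nu = 0$.
Let $I^1$ be an ordered subset of integers where

$$ \nu = \lim_{c \to \infty,\; c \in I^1} \frac{\|x_c^n - x^*\|}{\|x_c^0 - x^*\|}. $$

Let

$$ \delta = \liminf_{c \to \infty,\; c \in I^1} \frac{\|g_c^{n-1}\|}{\|g_c^0\|}. $$

Let $I^2$ be an ordered subset (of $I^1$) such that

$$ \delta = \liminf_{c \to \infty,\; c \in I^2} \frac{\|g_c^{n-1}\|}{\|g_c^0\|} = \limsup_{c \to \infty,\; c \in I^2} \frac{\|g_c^{n-1}\|}{\|g_c^0\|}. $$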

There are two cases to consider.

(i) Suppose $\delta = 0$. Then because of (26), the fact that $f(x_c^n) < f(x_c^{n-1})$, and (27),

$$ \lim_{c \to \infty} \frac{\|x_c^n - x^*\|}{\|x_c^0 - x^*\|} = 0. \tag{34} $$

Hence $\nu = 0$ for case (i).


(ii) Suppose $\delta > 0$. Here a modified induction proof is required. In
the following it is assumed that only $c$'s in $I^2$ are under consideration.

For those $c \in I^2$, there is an $\alpha_{16} > 0$ such that

$$ \|g_c^i\| \ge \alpha_{16} \|g_c^0\|, \quad 1 \le i \le n-1. \tag{35} $$
In the following there are three propositions I, II, and III. In
the modified induction proof (modified in the sense that the propositions
are to be proved not for all the integers, just for a subset), superscripts
(e.g. $\mathrm{I}^k$) will be used to indicate the integer for which the truth of the
proposition is being considered.

The three propositions are as follows.

I: $(g_c^k)' s_c^i = \|g_c^k\| \cdot \|s_c^i\| \cdot \epsilon_c^{k,i}$,

where

$$ \lim_{c \to \infty} \epsilon_c^{k,i} = 0, \tag{36} $$

for $0 \le i < k$; $k = 1, \ldots, n-1$.

For $k = n$,

$$ (g_c^n)' s_c^i = \|g_c^0\| \cdot \|s_c^i\| \cdot \epsilon_c^{n,i}, $$

where

$$ \lim_{c \to \infty} \epsilon_c^{n,i} = 0 \tag{37} $$

for $i = 0, \ldots, n-1$.


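II: $H_c^k G^* s_c^i = s_c^i + \delta_c^{k,i}$,

where

$$ \lim_{c \to \infty} \frac{\|\delta_c^{k,i}\|}{\|s_c^i\|} = 0. \tag{38} $$

III: $(s_c^k)' G^* s_c^i = \|s_c^k\| \cdot \|s_c^i\| \cdot \gamma_c^{k,i}$,

where

$$ \lim_{c \to \infty} \gamma_c^{k,i} = 0, \tag{39} $$

for $0 \le i < k$; $k = 1, \ldots, n-1$.

(The relations implied by III are a generalization of those given by
conjugate directions which hold when $f(x)$ is a positive definite quadratic
form.)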
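For comparison, when $f$ is a positive definite quadratic form with constant Hessian $G$ and the line searches are exact, the well-known relations for the DVMM hold without error terms. The display below writes them in the notation of propositions I, II and III with the cycle subscript dropped; it is a summary of the quadratic case under those assumptions, not part of the proof.

```latex
\begin{aligned}
(g^k)' s^i &= 0      && \text{(gradient orthogonal to earlier directions)}, \\
H^k G \, s^i &= s^i  && \text{(hereditary property of the update)}, \\
(s^k)' G \, s^i &= 0 && \text{(conjugate directions)},
\end{aligned}
\qquad 0 \le i < k .
```

The propositions can then be read as saying that, along the subsequence of cycles considered, the same three relations hold up to terms that become negligible as $c \to \infty$.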

We  note  in  passing  that  in  arguments  using  limits  as  those 
stated  above,  if  the  conclusions  are  true  for  a  finite  collection  of  terms, 
they  are  true  for  their  sum.  This  fact  will  be  used  implicitly  in  the 
following  proof. 

Proof for k = 1

$\mathrm{I}^1$: $(g_c^1)' s_c^0 = 0$ (from (14)). (40)

$\mathrm{II}^1$: $H_c^1 G(\eta_c^0) s_c^0 = H_c^1 y_c^0 (t_c^0)^{-1}$ (from (9), (17))

$= s_c^0$ (from (4), (9)). (41)
Thus

$$ H_c^1 G^* s_c^0 = H_c^1 G(\eta_c^0) s_c^0 + H_c^1 [G^* - G(\eta_c^0)] s_c^0 = s_c^0 + H_c^1 [G^* - G(\eta_c^0)] s_c^0. $$

Part $\mathrm{II}^1$ follows from the fact that $[G^* - G(\eta_c^0)] \to 0$ as $c \to \infty$
and (25).

$\mathrm{III}^1$: $(s_c^1)' G^* s_c^0 = (s_c^1)' G(\eta_c^0) s_c^0 + (s_c^1)' [G^* - G(\eta_c^0)] s_c^0$

$= 0$ (using (41) then (40)) $+ \, (s_c^1)' [G^* - G(\eta_c^0)] s_c^0$.

The remainder of the proof follows from the arguments used for part $\mathrm{II}^1$.


Assume true for k, prove true for k + 1

$\mathrm{I}^{k+1}$: A. Case where $2 \le k + 1 \le n - 1$.

For $i = k$, $(g_c^{k+1})' s_c^k = 0$ (from (14)).

For $0 \le i < k$,

$$ (g_c^{k+1})' s_c^i = [g_c^k + G(\eta_c^k) s_c^k t_c^k]' s_c^i \quad \text{(Taylor's Theorem)} $$

$$ = (g_c^k)' s_c^i + (s_c^k)' G^* s_c^i \, t_c^k + (s_c^k)' [G(\eta_c^k) - G^*] s_c^i \, t_c^k. \tag{44} $$

The induction hypothesis $\mathrm{I}^k$, with (30) and (35), takes care of the first
term in (44). The induction hypothesis $\mathrm{III}^k$ with (28), (30) and (35) takes
care of the second term. Part $\mathrm{I}^{k+1}$ for the third term is trivial.
B. Case where $k + 1 = n$.

Using the fact that $f(x_c^{n-1}) < f(x_c^0)$ along with (30) and the induction
hypotheses yields (37).

$\mathrm{II}^{k+1}$: For $i = k$, the proof is identical to that of $\mathrm{II}^1$. For $i < k$,

$$ H_c^{k+1} G^* s_c^i = H_c^k G^* s_c^i - H_c^k y_c^k [(y_c^k)' H_c^k y_c^k]^{-1} (y_c^k)' H_c^k G^* s_c^i + \sigma_c^k [(\sigma_c^k)' y_c^k]^{-1} (\sigma_c^k)' G^* s_c^i. \tag{45} $$

The induction hypothesis $\mathrm{II}^k$ ($H_c^k G^* s_c^i = s_c^i + \delta_c^{k,i}$) takes care of the
first term in (45). Using it in the second term gives two more quantities,
the second of which

$$ -H_c^k y_c^k [(y_c^k)' H_c^k y_c^k]^{-1} (y_c^k)' \delta_c^{k,i} $$

yields the desired conclusion because the quantity does not depend on the
magnitude of $y_c^k$, together with (24), (25), and the induction assumption $\mathrm{II}^k$.
For the first quantity

$$ -H_c^k y_c^k [(y_c^k)' H_c^k y_c^k]^{-1} (y_c^k)' s_c^i, $$

the desired conclusion will follow if

$$ (y_c^k)' s_c^i = \|y_c^k\| \cdot \|s_c^i\| \cdot \mu_c^{k,i}, \tag{46} $$

where

$$ \lim_{c \to \infty} \mu_c^{k,i} = 0. \tag{47} $$

Now

$$ (y_c^k)' s_c^i = t_c^k (s_c^k)' G^* s_c^i + t_c^k (s_c^k)' [G(\eta_c^k) - G^*] s_c^i \quad \text{[Taylor's Theorem]}. \tag{48} $$

Because of $\mathrm{III}^k$, and the continuity of $G(x)$, both terms have the
requirements necessary to show (46) and (47) hold. The conclusions
for the third term in (45) follow similar arguments. This completes the
proof of $\mathrm{II}^{k+1}$.


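$\mathrm{III}^{k+1}$: For $i = k$,

$$ (s_c^{k+1})' G^* s_c^k = (s_c^{k+1})' G(\eta_c^k) s_c^k + (s_c^{k+1})' [G^* - G(\eta_c^k)] s_c^k = 0 + (s_c^{k+1})' [G^* - G(\eta_c^k)] s_c^k. \tag{49} $$

The usual arguments apply.

For $i < k$,

$$ (s_c^{k+1})' G^* s_c^i = -(g_c^{k+1})' H_c^{k+1} G^* s_c^i \quad \text{(by definition)} $$

$$ = -(g_c^{k+1})' (s_c^i + \delta_c^{k+1,i}) \quad \text{(by } \mathrm{II}^{k+1}\text{)}, $$

which, using $\mathrm{I}^{k+1}$ and the property on $\delta_c^{k+1,i}$, completes the proof of
$\mathrm{III}^{k+1}$.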
We shall now show how the desired conclusions follow from $\mathrm{I}^n$
(eq. (37)).

First, note that because of III the matrix

$$ S^c = \{ s_c^0 \|s_c^0\|^{-1}, \ldots, s_c^{n-1} \|s_c^{n-1}\|^{-1} \} \tag{50} $$

for all $c$ large has an inverse, and

$$ \liminf_{c \to \infty} |\det(S^c)| > 0. $$

This follows because it is easy to show from III that any limit set of
vectors in the matrix $S^c$ above must be linearly independent.

Then, using (37) and solving for $g_c^n$,

$$ g_c^n = \|g_c^0\| \, [(S^c)']^{-1} (\epsilon_c^{n,0}, \ldots, \epsilon_c^{n,n-1})'. \tag{51} $$

Since the $\epsilon_c^{n,i}$'s all have property (37), for case (ii), part (c) of
the theorem follows from (26) by taking the norm of both sides of (51). Q.E.D.

An obvious corollary of this theorem follows from II.


Corollary 1 [Convergence to the inverse Hessian]

Under the assumptions (1), (2), and (3) of Theorem 2, for $c \in I^2$,

$$ \lim_{c \to \infty} \| [H_c^n - (G^*)^{-1}] s \| = 0, \quad \text{for all } s. \tag{52} $$

If  a  Lipschitz  condition  is  placed  on  the  second  derivatives  of  f 
it  is  possible  to  show  that  the  rate  of  convergence  to  the  strict  local 
minimum  is  at  least  quadratic  for  certain  subsets  of  integers. 
Theorem 3 [Quadratic Rate of Convergence]

If: (1) in a neighborhood of $x^*$ there is an $\alpha_3$ such that for any $y$, $z$
in that neighborhood, for any $j$ ($j = 1, \ldots, n$),

$$ \left| \sum_{i=1}^{n} \left( \partial^2 f(y)/\partial x_i \partial x_j - \partial^2 f(z)/\partial x_i \partial x_j \right) s_i \right| \le \alpha_3 \cdot \|s\| \cdot \|y - z\| \tag{53} $$

[Lipschitz Condition on second derivatives of f],

(2) the RDVMM is applied to problem (1),

(3) $G^*$ is a positive definite matrix,

then: (a) There is an $\alpha_4$ such that for $c$ large,

$$ \|x_c^n - x^*\| \le \alpha_4 \|x_c^0 - x^*\|^2 \tag{54} $$

for $c \in I^3$, any ordered set of integers with the property that

$$ \liminf_{c \to \infty,\; c \in I^3} \frac{\|g_c^{n-1}\|}{\|g_c^0\|} = \delta > 0. \tag{55} $$

(The qualification on the set of integers for which "at least
quadratic" convergence can be proved is needed because if the gradient
of f "drops an order of magnitude" during a group of iterations before
the nth point, the induction step $\mathrm{I}^{k+1}$ (see equation (44)) fails.
This drop in magnitude contributes to the superlinear convergence
(see Theorem 2), but it may not be as high as quadratic.)


Proof: The proof of this theorem uses many results of the proof of
Theorem 2. To avoid duplication, results from that theorem will be used.
First, we note that case (ii) obtains since $I^3$ is a set of integers having
the same properties as $I^2$ (compare (33) and (55)).

As in Theorem 2, three propositions need to be proved.

I: $(g_c^k)' s_c^i = \|g_c^k\| \cdot \|s_c^i\| \cdot \epsilon_c^{k,i}$, where

$$ |\epsilon_c^{k,i}| \le \alpha_c^{k,i} \|x_c^0 - x^*\|, \tag{56} $$

$$ \limsup_{c \to \infty} \alpha_c^{k,i} < +\infty, \tag{57} $$

for $0 \le i < k$; $k = 1, \ldots, n-1$.
For $k = n$,

$$ (g_c^n)' s_c^i = \|g_c^0\| \cdot \|s_c^i\| \cdot \epsilon_c^{n,i}, \tag{58} $$

where the $\epsilon_c^{n,i}$ have the properties implied by (56) and (57).

II: $H_c^k G^* s_c^i = s_c^i + \delta_c^{k,i}$,

where

$$ \frac{\|\delta_c^{k,i}\|}{\|s_c^i\|} \le \beta_c^{k,i} \|x_c^0 - x^*\|, $$

where

$$ \limsup_{c \to \infty} \beta_c^{k,i} < +\infty, $$

for $0 \le i < k$; $k = 1, \ldots, n-1$.

III: $\dfrac{(s_c^k)' G^* s_c^i}{\|s_c^k\| \cdot \|s_c^i\|} = \gamma_c^{k,i}$,

where

$$ |\gamma_c^{k,i}| \le \omega_c^{k,i} \|x_c^0 - x^*\|, $$

where

$$ \limsup_{c \to \infty} \omega_c^{k,i} < +\infty, $$

for $0 \le i < k$; $k = 1, \ldots, n-1$.

Analysis of these statements and a comparison with those of
Theorem 2 show that the difference is that the terms that vanish in
Theorem 2 are said to vanish (roughly) at the rate that $\|x_c^0 - x^*\| \to 0$.

To write out the complete proof would duplicate most of the proof
of Theorem 2. It shall be sufficient to analyze that vanishing quantity of
equation (44) which is not involved with the induction hypothesis.

Thus 


(using  (15),  (16),  and  (53)). 


For $j = 1, \ldots, n$ it follows from (18) and (27) that

$$ \|\eta_c^{k,j} - x^*\| \le \alpha_8 \|x_c^k - x^*\|. $$

Using this in (59) the chain of inequalities continues as

$$ \le \|s_c^k\| \cdot \|s_c^i\| \cdot \alpha_3 \cdot \alpha_8 \cdot \|x_c^k - x^*\| \cdot t_c^k $$

$$ \le \|s_c^k\| \cdot \|s_c^i\| \cdot \alpha_3 \cdot \alpha_{12} \cdot \alpha_8^2 \cdot \|x_c^0 - x^*\| \tag{60} $$

(using the fact that $f(x_c^k) < f(x_c^0)$ with (27), and (28)).

Thus (60) is of the form required in (56) and (57). Q.E.D.


Under the assumptions of Theorem 3, there is an $\alpha_{18}$ such that for
$c$ large, and $c \in I^3$,

$$ \| [H_c^n - (G^*)^{-1}] z \| \le \alpha_{18} \cdot \|x_c^0 - x^*\| \cdot \|z\| $$

for all $z$.

Proof: The proof is similar to that of Corollary 1.

The important observation about all this is that Theorems 2 and 3
on the rate of convergence would also apply if the resetting occurred at
the nth and not the (n+1)th point. That is, if (10) were replaced with
"$(k+1) \equiv 0 \bmod (n)$" instead of "$(k+1) \equiv 0 \bmod (n+1)$." This emphasizes the
tentative conclusion of McCormick and Pearson [3] that the rate of convergence
of Davidon's Variable Metric Method depends on its conjugate direction
properties, not on the fact that if Corollary 1 holds it is also a quasi-Newton
method.

For the original DVMM Powell [5] has shown that convergence to
a stationary point is guaranteed when the function to be minimized has a
Hessian matrix whose eigenvalues are bounded below away from zero. In
Theorem 1 it was shown that the RDVMM converges when just the continuity
of the first derivatives is required. There is experimental evidence in
McCormick and Pearson [3] to indicate that without the resetting feature,
the DVMM can fail to converge for a nonconvex function.

In [5] Powell showed that the rate of convergence of the DVMM is
every step superlinear if the second derivatives of f are Lipschitzian.
Under the same assumption in Theorem 3 it was shown that the RDVMM could
be expected to exhibit a quadratic rate of convergence every n steps.
Furthermore, with just the assumption that the eigenvalues of $\nabla^2 f$ be
bounded below away from zero, the RDVMM has n-step superlinear convergence.
In the first case it seems reasonable that an every step superlinear
rate of convergence would be better than n-step quadratic rate. There is
currently no theoretical analysis of this statement.